CHAPTER 23 Survival Regression 343
The baseline survival function’s table may have hundreds of rows for large data
sets, so instead of printing it, you should save the table as a data file. Then, you
can use it to generate a customized prognosis curve (described in the next section)
for any specific set of values for the predictor variables.
The software may also offer a graph of the baseline survival function. If your soft-
ware is using an average-participant baseline (see the earlier section, “The steps
to perform a PH regression”), this graph is useful as an indicator of the entire
group’s overall survival. But if your software uses a zero-participant baseline, the
curve is not helpful.
How Long Have I Got, Doc? Constructing
Prognosis Curves
A primary reason to use regression analysis is to predict outcomes from any par-
ticular set of predictor values. For survival analysis, you can use the regression
coefficients from a PH regression along with the baseline survival curve to con-
struct an expected survival (prognosis) curve for any set of predictor values.
Suppose that you’re survival time (from diagnosis to death) for a group of cancer
patients in which the predictors are age, tumor stage, and tumor grade at the time
of diagnosis. You’d run a PH regression on your data and have the program gen-
erate the baseline survival curve as a table of times and survival probabilities.
After that, whenever a patient is newly diagnosed with cancer, you can take that
person’s age, stage, and grade, and generate an expected survival curve tailored
for that particular patient. (The patient may not want to see it, but at least it could
be done.)
You’ll probably have to do these calculations outside of the software that you use
for the survival regression, but the calculations aren’t difficult and can be done in
a Microsoft Excel spreadsheet. The example in the following sections uses the
small set of sample data that’s preloaded into the online calculator for PH regres-
sion at https://statpages.info/prophaz.html. This particular example has
only one predictor, but the basic idea extends to multiple predictors.
Obtaining the necessary output
Figure 23-6 shows the output from the built-in example (omitting the Iteration
History and Overall Model Fit sections). Pretend that this model represents sur-
vival, in years, as a function of age for patients just diagnosed with some partic-
ular disease. In the output, the age variable is called Variable 1.